When AI Meets Data: The Promise and the Pressure of Bringing AI into Higher Education Systems



The implications of AI for data governance and security don’t often grab the headlines, but the work of incorporating this technology into institutional culture is vital, complex, and illuminating.

Credit: RDVector / Shutterstock.com © 2026

When artificial intelligence (AI) began to capture our imagination, most conversations in higher education centered on pedagogy, student engagement, and the future of teaching. Far less attention was given to what happens when AI reaches the operational heart of the institution—data systems, workflows, and reporting infrastructure. At Rowan University, as at many institutions, we began experimenting with AI in small, controlled ways: a chatbot that could answer questions about course availability, an analytics agent that could interpret enrollment trends and answer simple data questions. Each of these efforts seemed modest, even obvious. But together, they exposed a deeper truth: Bringing AI into real, regulated institutional data environments is not just a technology project; it’s a test of our entire data governance and security philosophy.

The AI conversation, however, often stops at the surface—new tools, new interfaces, and new opportunities. What many technology professionals are only beginning to realize is that embedding AI into organizational data systems is not an experiment in innovation but an exercise in institutional self-discovery. When we first began exploring using AI in our data environment, we imagined faster insights, quick and automated actions, and more easily accessed reporting. Faculty might be able to ask for enrollment by department; administrators could check retention by cohort; advisors could retrieve course information through natural-language queries. Those goals seemed straightforward. But as we experimented, we began to uncover something deeper, seeing how our university truly operates, how the same data can mean different things to different groups, and how much our work depends on knowledge that is mostly not formally documented anywhere. Our systems are built on context, experience, and interpretation. Whereas people instinctively understand what certain terms or trends mean within that context, AI does not. It must be taught those meanings deliberately.

The Hidden Complexity of Institutional Data

Colleges and universities sit on vast amounts of data, including student records, financial aid information, course schedules, employee payroll records, research activity data, and facilities systems information. On paper, having this much data across so many domains should be manageable, particularly given the significant advancements in modern data platforms, cloud infrastructure, and artificial intelligence. In practice, institutional data behave like dialects of a shared language: familiar yet never quite identical. The result is a federation of siloed data definitions, legacy applications, and human workflows stitched together through trust and institutional memory.

At Rowan, most of the data live in Oracle-based enterprise systems that have evolved over decades. These systems are stable, secure, and deeply customized but were not designed to include an AI layer. Every time we tried to connect AI to them, we ran into the same paradox: The data were technically accessible but semantically ungoverned. A simple prompt such as “Show me enrollment” produced a result that was mathematically correct but institutionally meaningless—the AI added enrollment counts across all academic years and proudly announced, “Your institution enrolled 600,000 students.” The number was not wrong in computation; it was wrong in context.
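The failure is easy to reproduce in miniature. The following is a minimal sketch, assuming a hypothetical `enrollment` table with invented headcounts (not Rowan's actual schema or figures), that contrasts the ungoverned sum with the per-year view a human analyst would expect.

```python
import sqlite3

# Hypothetical table and invented headcounts, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (academic_year TEXT, headcount INTEGER)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("2021-22", 19000), ("2022-23", 20000), ("2023-24", 21000)],
)

# What an ungoverned "Show me enrollment" can turn into: one sum over every year.
naive_total = conn.execute("SELECT SUM(headcount) FROM enrollment").fetchone()[0]

# The same data, scoped the way a human analyst would read it.
by_year = conn.execute(
    "SELECT academic_year, headcount FROM enrollment ORDER BY academic_year"
).fetchall()

print("Ungoverned answer:", naive_total)
for year, headcount in by_year:
    print(year, headcount)
```

Both queries are valid SQL over the same rows. Only the second one matches what the institution means by enrollment.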

We soon discovered that every familiar campus term hides multiple definitions. Consider the word “freshmen.” In some offices, it refers to students classified by credit hours. In others, it refers to an admit type: first time in college. For the registrar, it might mean an incoming fall cohort for reporting. Each definition is legitimate, but which one should the AI apply when a user asks about enrollment? Even temporal questions cause trouble. When someone asks, “Show me enrollment by year,” humans know to clarify whether the intent is fiscal year, academic year, or calendar year. AI does not. Unless explicitly told, it picks one interpretation without saying so.

These examples revealed the uncomfortable truth that our data were never designed for machine interpretation—they depended on institutional memory and the unspoken understanding that “everyone knows what we mean.” AI stripped away that safety net. It forced us to confront how much of our data governance relied on human context rather than explicit rules. Our efforts to provide AI with access to the university’s various data sources and systems taught us valuable lessons, both about AI and about the way institutional processes and structures really work.

Lesson 1: Context Is the New Security

Our first instinct was to treat AI as a security problem—prevent unauthorized access, control queries, and monitor logs. Those were necessary steps, but they didn’t solve the core issue. The real vulnerability was contextual.

A human analyst who sees “enrollment = 600,000” immediately knows something is wrong. An AI model does not. It lacks the institutional judgment that comes from experience and policy awareness. Without built-in context, even secure systems can produce misleading or damaging results. To address this, we built curated data products that mirror our institutional databases but that contain only vetted, policy-aligned information. Each dataset carried definitions, valid values, and ownership metadata. When AI interacted with data, it could only access these curated products, not raw transactional tables. The result was not only queries that were safer (because they were subject to access restrictions) but answers that were more accurate and trustworthy.
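One way to picture a curated data product is as a small catalog entry that carries its own definitions, valid values, and owner, plus a gate that rejects anything outside the catalog. The sketch below is illustrative only: the product name, columns, owner, and the `check_request` helper are all hypothetical, not a description of our production catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    name: str
    definition: str
    valid_values: tuple = ()


@dataclass(frozen=True)
class DataProduct:
    name: str
    owner: str
    columns: tuple

    def column_names(self):
        return {column.name for column in self.columns}


# Hypothetical catalog with a single curated product.
CATALOG = {
    "course_sections": DataProduct(
        name="course_sections",
        owner="Registrar (hypothetical owner)",
        columns=(
            ColumnSpec("term", "Academic term code"),
            ColumnSpec("level", "Course level from the section attribute", ("UG", "GR")),
            ColumnSpec("seats_available", "Open seats at last refresh"),
        ),
    )
}


def check_request(product_name, requested_columns):
    """Return the product only if the request stays inside the curated catalog."""
    product = CATALOG.get(product_name)
    if product is None:
        raise PermissionError(f"{product_name!r} is not a curated data product")
    unknown = set(requested_columns) - product.column_names()
    if unknown:
        raise PermissionError(f"columns not exposed: {sorted(unknown)}")
    return product
```

A request such as `check_request("course_sections", ["term", "level"])` succeeds, while a request naming a raw transactional table raises `PermissionError` before any query is built.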

We also began encoding context directly into prompts and data definitions. For instance, when a user asked about “enrollment,” the system automatically broke the query down by academic term and applied approved census logic. The AI wasn’t just retrieving data; it was reasoning within the guardrails of institutional meaning. That approach eliminated many early errors and helped align responses with official reporting standards. What started as a technical safeguard evolved into a semantic one. We learned that context itself is a security control. Protecting meaning is as important as protecting access.
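As a rough illustration of encoding context into prompts, the sketch below prepends approved definitions to the model's instructions whenever a governed term appears in the question. The glossary wording and the `build_system_prompt` function are assumptions made for this example; real definitions would come from the data owners.

```python
# Hypothetical glossary; the wording is an assumption for illustration.
GLOSSARY = {
    "enrollment": {
        "definition": "Headcount of registered students as of the census date",
        "default_grain": "academic term",
        "rule": "Never sum headcount across terms or years",
    },
}


def build_system_prompt(question, glossary=GLOSSARY):
    """Prepend the approved definition for every glossary term found in the question."""
    lowered = question.lower()
    lines = ["Use only the approved institutional definitions below."]
    for term, entry in glossary.items():
        if term in lowered:
            lines.append(
                f"- {term}: {entry['definition']}. "
                f"Default breakdown: {entry['default_grain']}. {entry['rule']}."
            )
    return "\n".join(lines)


print(build_system_prompt("Show me enrollment"))
```

The point of the design is that the definition travels with every request, so the model never has to guess what the term means.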

Lesson 2: Security Is a Tightrope

Beyond semantics, the most visible challenge was security. We had to decide whether an AI system would be allowed to query or act on live production databases. The technical answer was “yes”; the responsible answer was “no.” Opening a direct text-to-SQL channel from an AI model to an enterprise database might seem efficient, but it’s a security nightmare. Every prompt becomes a potential vector for injection attacks. Every generated query has the potential to bypass role-based access controls or expose sensitive student data.

Even read-only systems can pose problems. A single miswritten query can pull tens of thousands of records, breaching the principle of least privilege. More dangerously, a poorly tuned AI interface could interpret a user’s natural language as an instruction to change something (register for a course, drop a student, or update a grade), all without the necessary human checks. Through that experience, we eventually learned that the real question wasn’t “Can AI talk to the database?” but “What should AI be allowed to know and through what channel?”

Just as important are the questions of how much access AI should have and what access it should explicitly never have. Those questions became the foundation of our governance model, and we are still learning the answers. Each new use case forces us to revisit boundaries we thought were settled. A query that feels safe in one context can raise compliance or ethical concerns in another. Governance, we’ve realized, is not a checklist but a living practice that must evolve alongside technology.

When we first began experimenting with AI on institutional data, our instinct was to start simple: Connect the model conceptually, not physically. The goal was to let the AI reason about our data without ever touching the database itself. Our earliest prototype used a metadata-driven design. The AI model ingested schema information, table names, column descriptions, relationships, and definitions. It used that metadata to generate SQL queries but never executed them. Instead, those queries were passed through our application layer, which validated them, enforced security policies, and executed them through a controlled service account with read-only access.
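The validation step in the application layer can be sketched as a small gate that a generated statement must pass before it is handed to the read-only service account. This is a deliberately conservative illustration, not our actual validator: the table names are hypothetical, and a production version would rely on a real SQL parser and on database-level permissions rather than regular expressions.

```python
import re

ALLOWED_TABLES = {"course_sections", "meeting_patterns"}  # hypothetical curated views
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|grant|truncate)\b", re.I
)
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", re.I)


def validate_generated_sql(sql):
    """Return the cleaned statement if it passes basic policy checks, else raise ValueError."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(?is)^select\b", statement):
        raise ValueError("only SELECT statements are allowed")
    if FORBIDDEN.search(statement):
        raise ValueError("statement contains a write or DDL keyword")
    tables = {name.lower() for name in TABLE_REF.findall(statement)}
    if not tables or not tables <= ALLOWED_TABLES:
        raise ValueError(f"tables outside the curated set: {sorted(tables - ALLOWED_TABLES)}")
    return statement
```

The checks err on the side of rejection. For example, a semicolon inside a string literal is refused even though it is harmless, which is an acceptable trade-off for a sketch whose job is to fail closed.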

Technically, this separation worked well. It insulated the model from the database while giving users natural-language access to data. The real challenge came in managing performance, scalability, and validation. Some AI-generated queries were syntactically correct but computationally expensive, joining large tables or pulling far more data than intended. The problem wasn’t the AI’s intent but how easily a seemingly valid query could stress infrastructure or return misleading aggregates if not tightly governed.
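One simple guard against queries that pull far more data than intended is a hard row cap applied outside the generated SQL. The sketch below assumes a hypothetical threshold and uses an in-memory SQLite database as a stand-in for the real system; the idea is to fetch one row more than the cap so that an oversized result is detected and refused rather than silently truncated.

```python
import sqlite3

MAX_ROWS = 1000  # hypothetical threshold, for illustration only


def run_capped(conn, select_sql, max_rows=MAX_ROWS):
    """Run an already validated SELECT, refusing results larger than max_rows."""
    # Ask for one extra row so an oversized result can be detected.
    capped = f"SELECT * FROM ({select_sql}) LIMIT {int(max_rows) + 1}"
    rows = conn.execute(capped).fetchall()
    if len(rows) > max_rows:
        raise RuntimeError(f"query returns more than {max_rows} rows; narrow the question")
    return rows


# Small demonstration with invented data.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE t (n INTEGER)")
demo.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
print(len(run_capped(demo, "SELECT n FROM t", max_rows=10)))
```

A row cap does not address expensive joins on its own; statement timeouts and cost estimates would be natural companions, but they depend on the database in use.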

Because our enterprise data remain on-premises, any analytics or AI platform must connect to Oracle through a secure VPN tunnel. That tunnel is essential; it is the approved bridge between our cloud analytics environment and campus systems. But a VPN alone is transport, not governance. Without oversight above it, even an encrypted connection can become an open door.

Lesson 3: Architecture Shapes Behavior

Early on, we also faced a structural dilemma. To make AI-generated queries executable, we initially built flat views with pre-joined tables, assuming simplicity would help. Instead, it created rigidity, which showed up most clearly in scheduling data. To illustrate the problem, consider this prompt: “Find courses taught on Wednesday and Thursday.” Our normalized data model stored each course meeting pattern as a separate row (one for each day, time, and location), and filtering for Wednesday or Thursday was straightforward. In a flattened table, however, days became columns (day 1, day 2, day 3, and so forth), and each had to be linked to the correct time and building. A simple filter became an elaborate condition spanning multiple columns. Performance slowed and maintenance overhead grew each time a new pattern or field was added.
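The difference between the two shapes is easiest to see side by side. The sketch below uses invented courses and hypothetical table and column names in an in-memory SQLite database; with only two day slots the flattened filter is already a pair of OR conditions, and it grows with every additional slot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: one row per course meeting (hypothetical schema and data).
CREATE TABLE meeting_patterns (course TEXT, day TEXT, start_time TEXT, building TEXT);
INSERT INTO meeting_patterns VALUES
  ('BIOL 101', 'Wed', '09:30', 'Science Hall'),
  ('BIOL 101', 'Thu', '09:30', 'Science Hall'),
  ('HIST 210', 'Mon', '11:00', 'Main Hall'),
  ('HIST 210', 'Wed', '11:00', 'Main Hall');

-- Flattened: one row per course, one column group per meeting slot.
CREATE TABLE sections_flat (
  course TEXT,
  day1 TEXT, time1 TEXT, building1 TEXT,
  day2 TEXT, time2 TEXT, building2 TEXT
);
INSERT INTO sections_flat VALUES
  ('BIOL 101', 'Wed', '09:30', 'Science Hall', 'Thu', '09:30', 'Science Hall'),
  ('HIST 210', 'Mon', '11:00', 'Main Hall', 'Wed', '11:00', 'Main Hall');
""")

# Courses that meet on both Wednesday and Thursday, normalized model.
normalized = conn.execute("""
  SELECT course FROM meeting_patterns
  WHERE day IN ('Wed', 'Thu')
  GROUP BY course
  HAVING COUNT(DISTINCT day) = 2
""").fetchall()

# The same question against the flattened table: every slot must be checked.
flattened = conn.execute("""
  SELECT course FROM sections_flat
  WHERE (day1 = 'Wed' OR day2 = 'Wed')
    AND (day1 = 'Thu' OR day2 = 'Thu')
""").fetchall()

print(normalized, flattened)
```

Both queries return the same course here, but only the normalized one stays unchanged when a course gains a third or fourth meeting day.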

We eventually realized that keeping data normalized and relational and performing joins dynamically through a governed semantic layer preserved both flexibility and accuracy. Queries could adapt to user intent (filtering, grouping, or joining in real time) without forcing the underlying structure into a brittle, column-based shape. The normalized model also aligned better with institutional governance because definitions and relationships could evolve without breaking every downstream view.

In our later design, that semantic layer evolved into what we now describe as an agentic semantic layer, a governed environment that not only stores relationships and measures but also instructs AI models on how to interpret them responsibly. It maps institutional language to canonical fields, enforces joins and filters, and provides query feedback loops so that computational efficiency and data integrity are preserved. This approach allowed us to define joins, filters, and permissions inside a secure semantic framework. The AI interacts with this governed layer, and that layer interacts with our on-prem database through the VPN connection. All queries run within curated datasets that reflect institutional policy and performance thresholds.
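In spirit, a governed semantic layer lets the AI choose from named measures and dimensions while the layer itself supplies the joins and mandatory filters. The following is a minimal sketch under that assumption; the model contents, table aliases, census flag, and `compile_query` function are hypothetical and far simpler than a real semantic layer.

```python
# Hypothetical semantic model: names, joins, and filters are illustrative assumptions.
SEMANTIC_MODEL = {
    "measures": {"enrollment": "COUNT(DISTINCT e.student_id)"},
    "dimensions": {"department": "c.department", "term": "e.term_code"},
    "from": "enrollments e JOIN courses c ON c.course_id = e.course_id",
    "required_filters": ["e.census_flag = 'Y'"],
}


def compile_query(measure, dimensions, model=SEMANTIC_MODEL):
    """Turn a (measure, dimensions) request into SQL using only governed definitions."""
    if measure not in model["measures"]:
        raise KeyError(f"unknown measure: {measure}")
    unknown = [d for d in dimensions if d not in model["dimensions"]]
    if unknown:
        raise KeyError(f"unknown dimensions: {unknown}")

    columns = [model["dimensions"][d] for d in dimensions]
    measure_sql = model["measures"][measure] + " AS " + measure
    select_list = ", ".join(columns + [measure_sql])
    where_clause = " AND ".join(model["required_filters"])

    sql = f"SELECT {select_list} FROM {model['from']} WHERE {where_clause}"
    if columns:
        sql += " GROUP BY " + ", ".join(columns)
    return sql


print(compile_query("enrollment", ["department", "term"]))
```

Because the join path and the census filter live in the model rather than in the prompt, the AI cannot omit them, and a change to either is made in one place.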

By separating query generation from query execution, and by embedding institutional logic within the semantic layer, we maintained both control and agility. The system became intelligent in behavior yet compliant and stable in structure, able to “reason” about data safely without disrupting operations.

The insight was simple but lasting: AI does not need unrestricted access to our databases; it needs governed access to trusted data.

Lesson 4: A Blank Slate Can Be Misleading

When we first built a chatbot to help users find course information, we quickly learned that a blank text box invites infinite expectations. To many students and staff, it looked like an open portal to everything the university “knew.” In reality, our course chatbot—which we dubbed the Section Tally chatbot—was designed for a very specific purpose: to help users explore course offerings, schedules, and section availability drawn from the registration system. Without clear context, users didn’t know that. We received every kind of question imaginable—“Which professors give good grades?” and “Which courses are tied to my program?”—queries the system was never built to answer. Others went further, asking it to predict next year’s enrollment or run statistical models on retention. The AI was not failing; the expectations were simply misplaced.

We realized that transparency had to start before the first question was even typed. We added instructions, sample prompts, and guidance right inside the chat interface listing what the chatbot could answer, what data columns it drew from, and where its “knowledge” stopped. We also clarified that it worked only with course-level and schedule information, not with student or faculty data. Even with those cues, however, a few misunderstandings persisted. Some students continued to ask questions involving data we had deliberately excluded for privacy or compliance reasons. Each of those moments became an opportunity for valuable feedback, a reminder that building trust with AI is not about expanding capability but about making boundaries visible and understandable.

Over time, the Section Tally chatbot evolved from an empty prompt to a guided conversation. Instead of appearing to “know everything,” it began to explain itself, telling users why certain questions were out of scope and where to find official answers elsewhere. That shift changed perception. What once felt like limitation started to feel like reliability. We came to believe that, rather than trying to answer every question or even most questions, a responsible AI system should “know” when to stay silent.

Sometimes, saying nothing at all is the most trustworthy response.
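A scope check of this kind can be sketched very simply. The example below is a crude keyword router written for illustration only; the topic list, the wording of the message, and the `route` function are assumptions, and a real chatbot would likely use a more robust classifier.

```python
import re

# Hypothetical out-of-scope topics and the word stems that signal them.
OUT_OF_SCOPE = {
    "grades": ("grade", "gpa"),
    "forecasting": ("predict", "forecast"),
    "retention analysis": ("retention",),
}
SCOPE_NOTE = (
    "I can only answer questions about course offerings, schedules, "
    "and section availability."
)


def route(question):
    """Decline with an explanation when a question touches an out-of-scope topic."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for topic, stems in OUT_OF_SCOPE.items():
        if any(word.startswith(stem) for word in words for stem in stems):
            return {
                "action": "decline",
                "message": f"{SCOPE_NOTE} Questions about {topic} are out of scope.",
            }
    return {"action": "answer", "message": None}


print(route("Which professors give good grades?")["message"])
```

The useful part is less the matching logic than the response shape: a decline always carries a sentence explaining the boundary, so the user learns what the tool is for.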

Lesson 5: Fine-Tune the Model before Trusting It

One of the most important lessons we learned is that you cannot trust an untuned model with institutional data, no matter how sophisticated the model appears to be. Off-the-shelf large language models are trained for general understanding, not institutional precision. They can generate SQL or explain metrics convincingly, but without grounding in our definitions and policies they risk producing answers that are fluent yet fundamentally wrong.

Fine-tuning is not about making the AI sound smarter. It is about making it institutionally correct. Before releasing any system to campus users, we perform targeted fine-tuning and prompt calibration so that the model properly incorporates our vocabulary, data relationships, and governance rules. This ensures that when someone asks for information related to enrollment, retention, or financial aid, the model interprets those terms using approved institutional definitions.

During early testing of our course chatbot, a staff member prompted the system to “Show me all undergraduate courses.” The AI pulled data using the course-level field, which had a value “UG.” The user objected, “No, you cannot use that; you have to use the section attribute instead.” Both fields existed, but only one carried institutional meaning. That single exchange exposed a deeper governance problem: If two sources encode the same concept differently, only one should feed the model. Redundant or unused fields create semantic noise that confuses both AI and humans. We subsequently audited our schema, removing unused or legacy columns from the AI training corpus so the model would only learn from authoritative fields.

A similar issue surfaced when a department added prerequisite notes inside a free-text “comments” field even though a structured “prerequisite” column already existed. The model, unaware of this human workaround, relied on the structured prerequisite column and found it empty. Technically, it was correct; semantically, it missed the real data hidden in text. That incident highlighted the deeper, long-standing challenge that data practices evolve faster than data cleanup. Much of our institutional data have been stored this way for years, blending structure with narrative. We cannot fix this problem overnight, but we are steadily moving toward resolution, redefining standards, cleaning legacy records, and reinforcing habits that separate data from commentary. Over time, these incremental changes will ensure that models learn from structured truth rather than inherited inconsistency. In the end, protecting access is only part of the challenge; protecting meaning, structure, and consistency is what truly determines whether AI “understands” the institution.

Of course, you cannot fine-tune for every possible question users might ask. The universe of potential queries is infinite. But by doing a thorough job upfront and covering the majority of high-frequency and high-impact questions, we dramatically reduce the risk of confusion or misinformation. For the inevitable edge cases, we maintain a feedback loop that captures real user questions and model responses. Every ambiguous or incorrect interaction becomes a learning artifact. We review these periodically and feed them back into the fine-tuning process, aligning the model more closely with our institutional data and logic.

This iterative cycle of fine-tune, monitor, and retrain is essential. It is how accuracy improves, trust strengthens, and context deepens. In practice, our goal is not perfection but reliability, a model that gets most answers right the first time and learns responsibly from the ones it gets wrong. That is what transforms AI from an experiment into an evolving institutional partner.
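The feedback loop described above can be as simple as an append-only log plus a filter for flagged interactions. This sketch writes JSON lines to an in-memory buffer standing in for a file or table; the field names and both helper functions are hypothetical, and the sample entries reuse the undergraduate-courses exchange purely as an example.

```python
import io
import json
from datetime import datetime, timezone


def log_interaction(stream, question, response, flagged, note=""):
    """Append one interaction as a JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "response": response,
        "flagged": flagged,
        "note": note,
    }
    stream.write(json.dumps(record) + "\n")
    return record


def review_queue(stream):
    """Return every flagged interaction for periodic human review."""
    stream.seek(0)
    records = [json.loads(line) for line in stream if line.strip()]
    return [record for record in records if record["flagged"]]


log = io.StringIO()  # stand-in for a log file or database table
log_interaction(
    log,
    "Show me all undergraduate courses",
    "Filtered on the course-level field",
    flagged=True,
    note="Should use the section attribute",
)
log_interaction(log, "Which sections meet on Wednesday?", "Listed matching sections", flagged=False)

queue = review_queue(log)
print(len(queue), queue[0]["note"])
```

How flagged records are turned into tuning data is a separate decision; the sketch only shows the capture and review steps.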

Lesson 6: Human-in-the-Loop Is Not Optional

As we moved from information retrieval to action, new ethical and procedural questions emerged. Could AI one day suggest course adjustments, assist in academic planning, or guide an advisor through a complex workflow? Technically yes, but only under human supervision.

We decided that humans must either provide oversight to individual transactional scenarios or approve bulk transactions that AI can perform. AI can propose an action, but a person has to confirm it. We call it the AI to Human Handshake, and it is a guiding principle in an AI Advisor application we are building. In this application, AI performs the background work that otherwise takes hours of manual effort. It can retrieve information from multiple systems, check schedules, cross-reference student progress, and prepare suggested actions. But it always pauses for human confirmation before anything is finalized. The advisor sees a clear summary of what the AI intends to do, reviews it, and decides whether to proceed. That step transforms automation into assurance. The simple goal is to let AI handle the repetitive and technical tasks so that advisors can focus on judgment, context, and conversation. This design keeps control within the human realm while freeing people from routine lookups, list uploads, and data checks. The handshake ensures that the system is efficient without losing accountability.
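The handshake amounts to a small state machine in which execution is impossible until a person has approved the proposal. The sketch below is an illustration of that principle, not the AI Advisor's actual code; the class, function names, and sample payload are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


@dataclass
class ProposedAction:
    summary: str          # plain-language description shown to the advisor
    payload: dict         # what would actually be submitted
    status: Status = Status.PROPOSED
    reviewed_by: Optional[str] = None


def review(action, reviewer, approve):
    """Record a human decision on a proposed action."""
    if action.status is not Status.PROPOSED:
        raise ValueError("action has already been reviewed")
    action.status = Status.APPROVED if approve else Status.REJECTED
    action.reviewed_by = reviewer
    return action


def execute(action, handler):
    """Run the action only after a human has approved it."""
    if action.status is not Status.APPROVED:
        raise PermissionError("action has not been approved by a human reviewer")
    result = handler(action.payload)
    action.status = Status.EXECUTED
    return result
```

In use, the AI would create a `ProposedAction`, the advisor would call `review(...)` after reading the summary, and only then would `execute(...)` pass the payload to whatever system performs the change.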

For our course information retrieval chatbot, we held focus groups to understand how students and staff interacted with AI when searching for course information. Those sessions helped us observe how users phrased questions, what they trusted, and where confusion arose. The insights from those conversations now inform the design of the AI Advisor as well, especially in making instructions clear, boundaries visible, and responses transparent.

Many users still see AI as a threat, something that might replace their expertise. When they experience these systems and see that AI amplifies rather than replaces their role, that perception changes. Transparency, control, and education turn skepticism into curiosity and curiosity into trust.

In the end, the human-in-the-loop is not just a safeguard. It is the foundation of this entire approach. The AI to Human Handshake captures the spirit of responsible innovation: technology that works alongside people, not in place of them.

Lesson 7: Culture Is Infrastructure

The more we worked with AI, the clearer it became that technology reveals culture. Every system we built surfaced not just data challenges but organizational ones. Questions that seemed technical often turned out to be cultural. People would ask, “Why can’t I just download the data into Excel?” or “Can I still use the traditional reports?” or “Why can’t I do it my way?” These were not questions about access or skill. They were questions about culture change.

The AI Advisor app and the course information chatbot both showed us that adoption is not only about accuracy or performance. It is about trust, habits, and shared understanding. Technology may introduce new capabilities, but people decide whether those capabilities actually take root. We also realized that our governance structures will need to change as AI becomes more embedded in daily work. Traditional data governance focuses on accuracy, access, and security, but AI introduces dimensions such as how models reason, how they explain themselves, and how people interpret their responses. Managing these dimensions requires a cross-functional AI governance group that can bring together technical, academic, and administrative perspectives. This new kind of governance is not static. It will keep evolving as the technology and expectations evolve. Policies written today may need revision tomorrow. The goal is not to stifle AI but to learn with it, to make sure that each advancement aligns with our institutional values and shared sense of responsibility.

At the same time, culture will only change if AI is embedded into people’s actual work. If the tools meet them where they are—inside the systems, workflows, and moments that already define their day—then adoption feels natural, not forced. When AI reduces friction and gives people the ability to act faster, see more clearly, or make better decisions, it empowers them rather than replaces them. That is the kind of change institutions need.

Ultimately, culture is what allows that transformation to happen. It is the foundation that determines whether AI strengthens the institution or simply exposes its fault lines. The stronger the culture of openness, collaboration, and learning, the readier the institution will be for whatever AI becomes next.

Lesson 8: Tools and Platforms Are Incorporating AI

Every vendor now wants to claim an “AI story.” From LMS providers to ERP vendors to CRM platforms, everyone is embedding AI into their products. While this trend accelerates innovation, it also fragments governance. Each system begins to host its own “intelligent” layer, sometimes trained on institutional data, sometimes on external sources, often with its own log. Some vendors allow users to query curated data models using natural language, which solves many interface problems but still requires us to encode definitions accurately. Very few vendors offer robust fine-tuning capabilities to adjust their AI based on our data.

This proliferation creates a new kind of risk: AI silos. What if each vendor’s tool interprets the same institutional data differently? A faculty dashboard might say 18,000 enrolled students, whereas an advising chatbot might report 17,500, depending on what filters are used and what data are fed into that system. Neither is wrong within its own schema, but they are inconsistent institutionally. The root cause is not technology but the absence of flexibility in fine-tuning the AI in those platforms. When every platform carries its own embedded AI, the integrity of results depends on the clarity of the data feeding them. Without consistent definitions, “AI in the platform” simply automates confusion faster.
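A toy example shows how two tools can both be right and still disagree. The records and both counting rules below are invented for illustration; the names `dashboard_enrollment` and `chatbot_enrollment` are hypothetical stand-ins for two vendor platforms applying different filters to the same data.

```python
# Invented student records, for illustration only.
STUDENTS = [
    {"id": 1, "registered": True, "degree_seeking": True},
    {"id": 2, "registered": True, "degree_seeking": True},
    {"id": 3, "registered": True, "degree_seeking": False},
    {"id": 4, "registered": False, "degree_seeking": True},
]


def dashboard_enrollment(rows):
    """One platform counts every registered student."""
    return sum(1 for row in rows if row["registered"])


def chatbot_enrollment(rows):
    """Another platform counts only registered, degree-seeking students."""
    return sum(1 for row in rows if row["registered"] and row["degree_seeking"])


print(dashboard_enrollment(STUDENTS), chatbot_enrollment(STUDENTS))
```

Neither function contains a bug. The discrepancy disappears only when both platforms are fed the same agreed definition, which is a governance decision rather than a coding one.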

Lesson 9: Governance Must Evolve as Fast as the Technology

Traditional data governance frameworks assume slow change, but AI changes daily. Policies written for static systems cannot keep up with generative models that learn new behavior with every release. What we do today may be obsolete tomorrow. Model updates can alter overnight how an AI interprets queries or actions. A policy that works for one version may fail for the next. The same goes for vendor tools. Today’s “AI-enabled module” may rely on one reasoning layer; tomorrow’s upgrade could integrate an entirely new model architecture.

To manage this pace, we need an adaptive governance model. Each AI use case goes through a short-form risk and compliance review: What data are being accessed? What definitions apply? What audit trail exists? These reviews are revisited quarterly, not annually. Governance at Rowan has become a living process, not a one-time approval. This flexibility has been essential. We no longer think of governance as documentation; we think of it as version control for trust.

As we continue to learn, one challenge stands out: teaching AI to recognize ambiguity and ask follow-up questions instead of confidently producing the wrong answer. When the AI misinterprets a query, we log it, and each “wrong” answer becomes a governance artifact, a record of where human assumptions met machine literalism. One particularly memorable incident involved a request for “average aid per student.” The AI included loans, grants, and work-study, inflating the figure. That misstep led us to separate “aid awarded,” “aid accepted,” and “aid disbursed” into distinct definitions, an improvement that benefited every reporting process, not just AI. When a user asks for something unclear such as “average aid” or “student performance,” a human would naturally ask, “Do you mean by department? By term? Only institutional aid?” Our AI, however, still tends to make assumptions and move forward. Getting it to pause, clarify, and confirm intent will be critical for building trust and accuracy. We are not there yet, but these moments of misunderstanding are showing us what the next phase of learning needs to look like—for the AI and for us.
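One modest step toward that behavior is to check a question for governed terms that have more than one approved meaning and to ask before answering. The sketch below is a simplified illustration; the term list and the `clarify_or_proceed` function are assumptions for this example, and it handles only exact phrases.

```python
import re

# Terms with more than one approved meaning (illustrative list).
AMBIGUOUS_TERMS = {
    "aid": ["aid awarded", "aid accepted", "aid disbursed"],
    "year": ["academic year", "fiscal year", "calendar year"],
}


def clarify_or_proceed(question):
    """Return a follow-up question if a term is ambiguous, or None if it is safe to proceed."""
    lowered = question.lower()
    for term, options in AMBIGUOUS_TERMS.items():
        mentioned = re.search(rf"\b{re.escape(term)}\b", lowered)
        already_specific = any(option in lowered for option in options)
        if mentioned and not already_specific:
            choices = ", ".join(options[:-1]) + ", or " + options[-1]
            return f"Do you mean {choices}?"
    return None


print(clarify_or_proceed("What is the average aid per student?"))
print(clarify_or_proceed("Average aid disbursed per student by academic year"))
```

The first call returns a clarifying question and the second returns `None`, because the user has already chosen among the approved definitions.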

Lesson 10: Leadership Matters More than Code

Perhaps the most important lesson is organizational. AI initiatives succeed when leadership treats them as institutional transformations, not pilots. Every institution that embarks on this journey needs visible executive sponsorship, clear ethical principles, and communication that connects AI adoption to mission and values.

Leadership also needs to navigate the vendor ecosystem with caution. The marketplace moves faster than policy can follow, and every new “AI powered” release tempts institutions to chase capabilities rather than coherence. The role of leadership is to articulate a unifying vision of what AI should do for the institution, not what each product can do in isolation. At Rowan, we found progress when leaders viewed AI through the lens of stewardship. The question was never, “Can we do this?” but “How do we do this responsibly, equitably, and sustainably?” That mindset turned compliance into confidence.

The Path Ahead: Still Early Days

For all the progress we have made, the feeling is less that we have arrived and more that we have just cleared the starting line. Every breakthrough in AI brings a new challenge, and every safeguard we design today may need revision tomorrow. The truth is that AI in higher education is still in its infancy. We are building frameworks for technologies that evolve faster than our policy cycles, and we are learning in real time what it means for machines to interpret institutional data, policy, and process. The work ahead is not only about coding models or securing APIs but about reinventing institutional infrastructure around clarity, stewardship, and adaptability. The hardest part is not deploying AI but keeping it accurate, accountable, and aligned with our values as the ground shifts beneath us. If one insight stands above the rest, it’s that AI does not solve governance but exposes it. AI reveals every missing definition, every undocumented rule, every assumption that lives quietly in someone’s inbox or memory. That exposure is uncomfortable, but it is also an invitation to build stronger systems, clearer language, and a culture of shared responsibility for data-driven decision-making. AI will not replace human judgment; it will test it. Our job in higher education is to make sure that test strengthens our institutions rather than undermines them.

We are still at the beginning. But it is a beginning worth getting right.

Acknowledgment

We want to express our heartfelt appreciation to Jackie Ring, Vice Chancellor and Chief Institutional Research Officer at Rowan University, for reviewing this article and providing invaluable feedback.


Bharathwaj Vijayakumar is Assistant Vice President, Institutional Data & Analytics, at Rowan University.

Samyukta Alapati is Director, Institutional Data & Analytics, at Rowan University.

© 2026 Bharathwaj Vijayakumar and Samyukta Alapati. The content of this work is licensed under a Creative Commons BY-NC-SA 4.0 International License.